Add maxfailures option to limit test failures before stopping by VanitasCodes · Pull Request #61560 · JuliaLang/julia

VanitasCodes · 2026-04-11T02:07:01Z

This came out of the triage discussion on #61483, where @oscardssmith suggested that instead of changing the default behavior of @testset, we should add a maxfailures option that test runners can use to control how many failures are tolerated before stopping execution. This PR implements the Test stdlib side of that.

There's a global atomic counter that tracks failures and errors across testsets, and a configurable limit. When the count hits the limit, a MaxFailuresError is thrown and execution stops. The default limit is typemax(Int), so existing behavior is completely unchanged unless you explicitly call set_max_failures.

Four new functions are exported from Test:

set_max_failures(n) - set the limit
get_max_failures() - read the current limit
get_failure_count() - read the current failure count
reset_failure_count() - reset the counter to zero

The MaxFailuresError integrates with the existing is_failfast_error machinery, so it propagates correctly through nested testsets the same way FailFastError does. If both failfast and maxfailures are set, failfast takes precedence since it's checked first in record.

The intended usage is for something like Pkg.test(; maxfailures=10) to call Test.set_max_failures(10) before running tests, which would be a follow-up PR.

VanitasCodes · 2026-04-11T02:08:08Z

@oscardssmith Is this along the lines of what you had in mind?

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a maxfailures mechanism to Julia’s Test stdlib to stop a test run after a configurable number of failures/errors, without changing default @testset behavior.

Changes:

Introduces global atomic failure counter + limit, plus exported setters/getters/reset helpers.
Adds MaxFailuresError and integrates it with existing failfast propagation/printing paths.
Adds stdlib tests validating default behavior, stopping at N failures, counting errors, and invalid input.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

File	Description
stdlib/Test/src/Test.jl	Implements global max-failures tracking, exports API, adds `MaxFailuresError`, and wires it into `record` and top-level printing.
stdlib/Test/test/runtests.jl	Adds integration-style tests that spawn a Julia process to verify max-failures behavior and messaging.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-11T02:15:31Z

+            result = read(pipeline(ignorestatus(cmd), stderr=devnull), String)
+            @test occursin("Max failures reached: 1", result)
+            @test occursin("First", result)
+            @test !occursin(r"Test Summary:.*\n.*Second", result)


These regexes assume \n line endings. To make the tests more robust across platforms/environments, consider matching \r?\n (or otherwise avoiding hard-coding newline style) so the assertion doesn’t become Windows-sensitive.

Copilot · 2026-04-11T02:15:32Z

+            result = read(pipeline(ignorestatus(cmd), stderr=devnull), String)
+            @test occursin("Max failures reached: 2", result)
+            @test occursin("First", result)
+            @test !occursin(r"Test Summary:.*\n.*Second", result)


These regexes assume \n line endings. To make the tests more robust across platforms/environments, consider matching \r?\n (or otherwise avoiding hard-coding newline style) so the assertion doesn’t become Windows-sensitive.

Copilot · 2026-04-11T02:15:32Z

+            result = read(pipeline(ignorestatus(cmd), stderr=devnull), String)
+            @test occursin("Max failures reached: 1", result)
+            @test occursin("First", result)
+            @test !occursin(r"Test Summary:.*\n.*Second", result)


These regexes assume \n line endings. To make the tests more robust across platforms/environments, consider matching \r?\n (or otherwise avoiding hard-coding newline style) so the assertion doesn’t become Windows-sensitive.

oscardssmith · 2026-04-11T02:44:17Z

Overall, I think this is the right direction (although seeing the changes in Pkg would be useful to see this in use).

DilumAluthge · 2026-04-11T02:46:45Z

Thanks for putting this together!

Four new functions are exported from Test:

Do those functions need to be exported (or public)?

VanitasCodes · 2026-04-11T13:09:56Z

@oscardssmith I have a few questions before I revise.

You mentioned the default should be 0 for backward compatibility. I originally went with typemax(Int) to mean "unlimited" so existing behavior stays the same, but I think you're suggesting 0 should mean "disabled/not counting" instead of "stop immediately"? That makes more sense. So the semantics would be 0 means the feature is off, and you'd call set_max_failures(10) to enable it with a limit of 10. Is that what you had in mind?

You're completely right about the atomics. I was overthinking the concurrency case, but tests run sequentially in a single process anyway. I'll switch to a simple Ref{Int} or just a module global.

For the API, if we make the limit nonatomic and just use direct access like global_failure_limit = n, do we still want helper functions at all, or should Pkg just set Test.global_failure_limit directly? I can see it going either way. The functions feel a bit over-engineered for what they do.

Should I draft a companion PR for Pkg showing how Pkg.test(; maxfailures=10) would use this, or wait until this settles?

VanitasCodes · 2026-04-11T13:11:30Z

@DilumAluthge I originally exported them thinking they might be useful for custom test runners, but @oscardssmith pointed out they probably shouldn't be. I think you're right that they should either be public (documented but not exported) or just kept internal and accessed as Test.set_max_failures(). What do you think makes more sense? I'm leaning toward not exporting them and letting Pkg access them via the Test. prefix.

DilumAluthge · 2026-04-12T00:16:43Z

just kept internal and accessed as Test.set_max_failures()

Keeping them internal (non-public) sounds good to me!

VanitasCodes · 2026-04-13T16:09:49Z

@oscardssmith I've pushed an updated version based on your feedback. The atomics are gone and the globals are now plain Ref{Int} values that Pkg can set directly via Test.global_failure_limit[] = n and reset via Test.global_failure_count[] = 0. I also removed all the helper functions and the export line since they're no longer needed. Let me know if this is closer to what you had in mind or if you'd like any further changes.

@DilumAluthge Updated the PR to remove the exports entirely as you and @oscardssmith suggested. The globals are now accessed directly via Test.global_failure_limit[] and Test.global_failure_count[], which keeps things simple and internal.

VanitasCodes · 2026-04-13T19:36:52Z

@oscardssmith The count is now a Threads.Atomic{Int} while the limit stays a plain Ref since it's only written once before the run starts.

VanitasCodes · 2026-04-13T21:03:37Z

@oscardssmith Is there anything else you'd like me to address before this is ready?

oscardssmith · 2026-04-13T21:31:08Z

Seems ready to me. I'd like to see the Pkg side PR before merging to get a real user of this, but the code here looks correct.

VanitasCodes · 2026-04-13T22:34:12Z

@oscardssmith I'll put together a draft PR for the Pkg side showing the integration. I'll link it here once it's ready.

DilumAluthge · 2026-04-15T19:12:54Z

@VanitasCodes Just wanted to check- have you used any AI tools when working on this PR? Either to generate the code or to write the PR description or comments?

VanitasCodes · 2026-04-15T19:22:03Z

@VanitasCodes Just wanted to check- have you used any AI tools when working on this PR? Either to generate the code or to write the PR description or comments?

I have had some assistance for the code, yes. But it was majorly used for understanding and navigating through the codebase more effectively.

VanitasCodes · 2026-04-16T01:26:25Z

@oscardssmith quick update before I push. While testing locally I found that when MaxFailuresError is thrown inside include(), it gets wrapped in a LoadError, so the catch blocks were never seeing it as a MaxFailuresError and were printing "Fail-fast enabled" instead of "Max failures reached". Fixed it by unwrapping first with real_err = err isa LoadError ? err.error : err before the check. The correct message now shows up.

Also while trying to test the Pkg integration I realized Pkg isn't actually tracked in the Julia repo so the Pkg.test(; maxfailures=n) side would need to be a separate PR on JuliaLang/Pkg.jl. I have it written and tested locally but wanted to confirm that's the right place before I open anything there. Is that how it's supposed to work?

oscardssmith · 2026-04-16T01:50:56Z

Is that how it's supposed to work?

Yeah, that's right.

VanitasCodes · 2026-04-16T02:01:22Z

@oscardssmith just pushed the LoadError unwrapping fix. Can you have a look at it and lemme know if the changes look right?

Copilot AI review requested due to automatic review settings April 11, 2026 02:07

Copilot AI reviewed Apr 11, 2026

View reviewed changes

oscardssmith reviewed Apr 11, 2026

View reviewed changes